Next: Specifying Coding Systems, Previous: User-Chosen Coding Systems, Up: Coding Systems [Contents][Index]
This section describes variables that specify the default coding system for certain files or when running certain subprograms, and the function that I/O operations use to access them.
The idea of these variables is that you set them once and for
all to the defaults you want, and then do not change them again.
To specify a particular coding system for a particular operation
in a Lisp program, don’t change these variables; instead,
override them using coding-system-for-read and
coding-system-for-write (see Specifying
Coding Systems).
This variable is an alist of text patterns and
corresponding coding systems. Each element has the form
(regexp . coding-system);
a file whose first few kilobytes match regexp is
decoded with coding-system when its contents are
read into a buffer. The settings in this alist take priority
over coding: tags in the files and the contents
of file-coding-system-alist (see below). The
default value is set so that Emacs automatically recognizes
mail files in Babyl format and reads them with no code
conversions.
This variable is an alist that specifies the coding
systems to use for reading and writing particular files. Each
element has the form (pattern .
coding), where pattern is a
regular expression that matches certain file names. The
element applies to file names that match
pattern.
The CDR of the element, coding, should be either a coding system, a cons cell containing two coding systems, or a function name (a symbol with a function definition). If coding is a coding system, that coding system is used for both reading the file and writing it. If coding is a cons cell containing two coding systems, its CAR specifies the coding system for decoding, and its CDR specifies the coding system for encoding.
If coding is a function name, the function
should take one argument, a list of all arguments passed to
find-operation-coding-system. It must return a
coding system or a cons cell containing two coding systems.
This value has the same meaning as described above.
If coding (or what returned by the above
function) is undecided, the normal
code-detection is performed.
This variable is an alist that specifies the coding
systems to use for reading and writing particular files. Its
form is like that of file-coding-system-alist,
but, unlike the latter, this variable takes priority over any
coding: tags in the file.
This variable is an alist specifying which coding systems
to use for a subprocess, depending on which program is
running in the subprocess. It works like
file-coding-system-alist, except that
pattern is matched against the program name used
to start the subprocess. The coding system or systems
specified in this alist are used to initialize the coding
systems used for I/O to the subprocess, but you can specify
other coding systems later using
set-process-coding-system.
Warning: Coding systems such as
undecided, which determine the coding system from
the data, do not work entirely reliably with asynchronous
subprocess output. This is because Emacs handles asynchronous
subprocess output in batches, as it arrives. If the coding system
leaves the character code conversion unspecified, or leaves the
end-of-line conversion unspecified, Emacs must try to detect the
proper conversion from one batch at a time, and this does not
always work.
Therefore, with an asynchronous subprocess, if at all
possible, use a coding system which determines both the character
code conversion and the end of line conversion—that is, one
like latin-1-unix, rather than
undecided or latin-1.
This variable is an alist that specifies the coding system
to use for network streams. It works much like
file-coding-system-alist, with the difference
that the pattern in an element may be either a
port number or a regular expression. If it is a regular
expression, it is matched against the network service name
used to open the network stream.
This variable specifies the coding systems to use for subprocess (and network stream) input and output, when nothing else specifies what to do.
The value should be a cons cell of the form
(input-coding .
output-coding). Here
input-coding applies to input from the subprocess,
and output-coding applies to output to it.
This variable holds a list of functions that try to determine a coding system for a file based on its undecoded contents.
Each function in this list should be written to look at
text in the current buffer, but should not modify it in any
way. The buffer will contain undecoded text of parts of the
file. Each function should take one argument,
size, which tells it how many characters to look
at, starting from point. If the function succeeds in
determining a coding system for the file, it should return
that coding system. Otherwise, it should return
nil.
If a file has a ‘coding:’ tag, that takes precedence, so these functions won’t be called.
This function tries to determine a suitable coding system
for filename. It examines the buffer visiting the
named file, using the variables documented above in sequence,
until it finds a match for one of the rules specified by
these variables. It then returns a cons cell of the form
(coding . source), where
coding is the coding system to use and
source is a symbol, one of
auto-coding-alist,
auto-coding-regexp-alist, :coding,
or auto-coding-functions, indicating which one
supplied the matching rule. The value :coding
means the coding system was specified by the
coding: tag in the file (see
coding tag in The GNU Emacs Manual). The
order of looking for a matching rule is
auto-coding-alist first, then
auto-coding-regexp-alist, then the
coding: tag, and lastly
auto-coding-functions. If no matching rule was
found, the function returns nil.
The second argument size is the size of text,
in characters, following point. The function examines text
only within size characters after point. Normally,
the buffer should be positioned at the beginning when this
function is called, because one of the places for the
coding: tag is the first one or two lines of the
file; in that case, size should be the size of the
buffer.
This function returns a suitable coding system for file
filename. It uses find-auto-coding to
find the coding system. If no coding system could be
determined, the function returns nil. The
meaning of the argument size is like in
find-auto-coding.
This function returns the coding system to use (by default) for performing operation with arguments. The value has this form:
(decoding-system . encoding-system)
The first element, decoding-system, is the coding system to use for decoding (in case operation does decoding), and encoding-system is the coding system for encoding (in case operation does encoding).
The argument operation is a symbol; it should
be one of write-region,
start-process, call-process,
call-process-region,
insert-file-contents, or
open-network-stream. These are the names of the
Emacs I/O primitives that can do character code and eol
conversion.
The remaining arguments should be the same arguments that
might be given to the corresponding I/O primitive. Depending
on the primitive, one of those arguments is selected as the
target. For example, if operation does
file I/O, whichever argument specifies the file name is the
target. For subprocess primitives, the process name is the
target. For open-network-stream, the target is
the service name or port number.
Depending on operation, this function looks up
the target in file-coding-system-alist,
process-coding-system-alist, or
network-coding-system-alist. If the target is
found in the alist, find-operation-coding-system
returns its association in the alist; otherwise it returns
nil.
If operation is
insert-file-contents, the argument corresponding
to the target may be a cons cell of the form
(filename . buffer). In
that case, filename is a file name to look up in
file-coding-system-alist, and buffer
is a buffer that contains the file’s contents (not yet
decoded). If file-coding-system-alist specifies
a function to call for this file, and that function needs to
examine the file’s contents (as it usually does), it
should examine the contents of buffer instead of
reading the file.
Next: Specifying Coding Systems, Previous: User-Chosen Coding Systems, Up: Coding Systems [Contents][Index]